Amdt. dated 01 October 2007 

Reply to Office action dated 27 June 2007 

Amendments to the Claims 

This listing of claims will replace all prior versions, and listings, of claims in the 
application: 

Listing of Claims: 

1. (currently amended) A method for balancing the load of a parallel processing system 
having a plurality of parallel processing elements arranged in a loop, wherein each processing 
element ($\mathrm{PE}_r$) has a local number of tasks associated therewith, wherein r represents the number 
for a selected processing element, and wherein each of said processing elements is operable to 
communicate with a clockwise adjacent processing element and with an anti-clockwise adjacent 
processing element, the method comprising: 

determining a total number of tasks present within said loop; 

calculating a local mean number of tasks for each of said plurality of processing 
elements; 

calculating a local deviation from said local mean number for each of said plurality of 
processing elements; 

determining a running partial deviation sum for each of said plurality of processing 
elements using said local deviation; 

determining a clockwise transfer parameter and an anti-clockwise transfer parameter for 
each of said plurality of processing elements from said running partial deviation sums; and 

redistributing tasks among said plurality of processing elements in response to said 
clockwise transfer parameter and said anti-clockwise parameter for each of said plurality of 
processing elements. 

2. (original) The method of claim 1 wherein said determining a total number of tasks 
present within said loop, comprises: 

transmitting said local number of tasks associated with each of said plurality of 
processing elements to each other of said plurality of processing elements within said loop; 

receiving within each of said plurality of processing elements said number of local tasks 
associated with said each other of said plurality of processing elements; and 

summing said number of local tasks associated with each of said plurality of processing 
elements with said number of local tasks associated with each other of said plurality of 
processing elements. 

3. (original) The method of claim 1 wherein said determining said total number of tasks 
present within said loop includes solving the equation $V = \sum_{i=0}^{N-1} v_i$, where N represents the number 
of said processing elements in said loop and $v_i$ represents said local number of tasks associated 
with an i-th processing element in said loop. 
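
By way of illustration only, and without limiting the claim, the summation recited in claim 3 might be sketched as follows. The function name `total_tasks` and the sample task counts are hypothetical and do not appear in the application.

```python
def total_tasks(local_counts):
    """Return V, the sum of the local task counts v_i for i = 0 .. N-1."""
    return sum(local_counts)


# Hypothetical loop of N = 4 processing elements with these local task counts.
v = [7, 1, 0, 2]
V = total_tasks(v)
```

In a loop as described in claim 2, each processing element would obtain the same V after receiving the local counts of every other processing element.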

4. (currently amended) The method of claim 1 wherein said calculating a local mean 
number of tasks within each of said plurality of processing elements includes solving the 
equation $M_r = \mathrm{Trunc}((V + E_r)/N)$, where $M_r$ is said local mean for said $\mathrm{PE}_r$, N is the total 
number of said processing elements in said loop, and $E_r$ is a number in the range of 0 to (N-1), 
V is the total number of tasks, and wherein each processing element has a different $E_r$ value. 

5. (currently amended) The method of claim 4 wherein said Trunc function is 
responsive to the value of $E_r$ such that said total number of tasks for said loop is equal to the sum 
of the local mean number of tasks for each of said plurality of processing elements in said loop. 

6. (currently amended) The method of claim 4 wherein said local mean 
$M_r = \mathrm{Trunc}((V + E_r)/N)$ for each local $\mathrm{PE}_r$ within said loop is equal to either one of X and 
(X+1). 
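
By way of illustration only, the local mean of claims 4 through 6 might be sketched as below. The sketch assumes, as one possible assignment not stated in the claims, that each processing element uses its own index r as its E_r value, and that V is non-negative so that integer floor division coincides with truncation. The function name `local_means` is hypothetical.

```python
def local_means(V, N):
    """Return [M_0, ..., M_(N-1)] with M_r = Trunc((V + E_r) / N).

    Assumption: E_r = r, so every processing element has a distinct E_r
    in the range 0 .. N-1. For non-negative V, `//` equals truncation.
    """
    return [(V + r) // N for r in range(N)]


# Hypothetical example: V = 10 tasks shared by N = 4 processing elements.
means = local_means(10, 4)
```

With these inputs the local means are 2, 2, 3 and 3: each is either X = 2 or X + 1 = 3, and together they sum to V = 10, which is the property recited in claim 5.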

7. (original) The method of claim 1 wherein said calculating a local deviation within each 
of said plurality of processing elements comprises finding the difference between said local 
number of tasks for each of said plurality of processing elements and said local mean number for 
each of said plurality of processing elements. 
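
By way of illustration only, the difference recited in claim 7 might be sketched as follows. The function name `local_deviations` and the sample values are hypothetical; the sign convention (local count minus local mean) is an assumption.

```python
def local_deviations(local_counts, local_means):
    """Return D_r = v_r - M_r for each processing element r.

    Assumption: a positive deviation means the element holds more tasks
    than its local mean; a negative deviation means it holds fewer.
    """
    return [v - m for v, m in zip(local_counts, local_means)]


# Hypothetical values: local counts and local means for N = 4 elements.
v = [7, 1, 0, 2]
M = [2, 2, 3, 3]
D = local_deviations(v, M)
```

When the local means sum to the total number of tasks, the deviations sum to zero.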

8. (currently amended) The method of claim 1 wherein said determining a running partial 
deviation sum for each of said plurality of processing elements comprises: 

transmitting said local deviation associated with each of said plurality of processing 
elements to an adjacent one of said plurality of processing elements within said loop; 

receiving within each of said plurality of processing elements said local deviation 
associated with at least one other of said plurality of processing elements; and 

summing within each of said plurality of processing elements said local deviation 
and said received local deviation; and 

repeating said transmitting, receiving, and summing a predetermined number of times. 

9. (original) The method of claim 1 wherein said determining a running partial deviation 
sum for each of said plurality of processing elements comprises solving the equation $S_j = \sum_{i=0}^{j} D_i$, 
where $S_j$ represents said running partial deviation sum, $D_i$ represents the local deviation 
associated with the i-th processing element, and $j \le (N-1)$ where N is the number of processing 
elements on said loop. 
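
By way of illustration only, the running partial deviation sum of claim 9 might be sketched as below. The function name `running_partial_sums` and the sample deviations are hypothetical.

```python
from itertools import accumulate


def running_partial_sums(deviations):
    """Return [S_0, ..., S_(N-1)] where S_j = D_0 + D_1 + ... + D_j."""
    return list(accumulate(deviations))


# Hypothetical local deviations for a loop of N = 4 processing elements.
D = [5, -1, -3, -1]
S = running_partial_sums(D)
```

Because the deviations in this example sum to zero, the last running sum is zero.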

10. (original) The method of claim 1 wherein said determining a clockwise transfer 
parameter and an anti-clockwise transfer parameter within each of said processing elements 
comprises: 

setting $T_a = (H_r + L_r) \div 2$; and 

setting $T_c = D - T_a$, where $H_r$ represents a highest extrema of said running partial 
deviation sum; $L_r$ represents a lowest extrema of said running partial deviation sum, D 
represents the local deviation of a selected local processing element; and $T_c$ represents 
said clockwise transfer parameter, and $T_a$ represents said anti-clockwise transfer 
parameter. 
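
By way of illustration only, the two settings of claim 10 might be sketched as follows. Several conventions here are assumptions not stated in the claims: the running partial deviation sums for a processing element r are taken to start at r and proceed around the loop in increasing index order, the division by 2 is taken as floor division, and a negative parameter is read as tasks received rather than sent. The function name `transfer_parameters` is hypothetical.

```python
def transfer_parameters(deviations, r):
    """Return (T_a, T_c) for processing element r.

    Assumptions: the running sums start at element r and wrap around the
    loop in increasing index order, and the division by 2 rounds down.
    """
    n = len(deviations)
    running, sums = 0, []
    for k in range(n):
        running += deviations[(r + k) % n]
        sums.append(running)
    h_r, l_r = max(sums), min(sums)   # highest and lowest extrema
    t_a = (h_r + l_r) // 2            # anti-clockwise transfer parameter
    t_c = deviations[r] - t_a         # clockwise transfer parameter
    return t_a, t_c


# Hypothetical local deviations for a loop of N = 4 processing elements.
D = [5, -1, -3, -1]
params = [transfer_parameters(D, r) for r in range(len(D))]
```

Under these assumptions the two parameters of each element add up to its local deviation, and the clockwise parameter of one element is the negative of the anti-clockwise parameter of the next element, so adjacent elements agree on what crosses the link between them.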

11. (currently amended) A method for reassigning tasks associated with a selected 
processing element within a parallel processing system having a plurality of processing elements 
connected in a loop, each of said plurality of processing elements having a local number of tasks 
associated therewith, the method comprising: 

determining the total number of tasks on said loop; 

computing a local mean value for said selected processing element; 

computing a local deviation for said selected processing element, said local deviation 
representative of the difference between said local number of tasks for said selected processing 
element and said local mean value for said selected processing element; 

determining a running partial deviation sum for said selected processing element using 
said local deviation; 

computing a number of tasks to transfer in a clockwise direction for said selected 
processing element from said running partial deviation sum; 

computing a number of tasks to transfer in an anti-clockwise direction for said selected 
processing element from said running partial deviation sum; and 

reassigning said tasks associated with said selected processing element relative to the said 
number of tasks to transfer in a clockwise direction and said number of tasks to transfer in an 
anti-clockwise direction. 

12. (original) The method of claim 11 wherein said determining the total number of tasks on 
said loop, comprises: 

receiving within said selected processing element said number of local tasks associated 
with said each other of said plurality of processing elements; and 

summing said number of local tasks associated with said selected processing element 
and said number of local tasks associated with each other of said plurality of processing 
elements. 

13. (currently amended) The method of claim 11 wherein computing a local mean value for 
a selected processing element includes solving the equation $M_r = \mathrm{Trunc}((V + E_r)/N)$, where 
$M_r$ represents said local mean for a processing element $\mathrm{PE}_r$, N represents the total number of 
processing elements in said loop, V is the total number of tasks, and $E_r$ is a number in the range 
of 0 to (N-1). 

14. (currently amended) The method of claim 13 wherein said Trunc function is 
responsive to the value of $E_r$ such that said total number of tasks for said loop is equal to the 
sum of the local mean number of tasks for each of said plurality of processing elements in said 
loop and wherein each processing element has a different $E_r$ value assigned. 

15. (currently amended) The method of claim 11 wherein said determining a running partial 
deviation sum for said selected processing element comprises: 

receiving within said selected processing element a local deviation associated with an 
adjacent one of said plurality of processing elements; and 

summing said local deviation associated with said selected processing element and said 
local deviation associated with at least one other of said plurality of processing elements. 

16. (original) The method of claim 11 wherein said determining a running partial deviation 
sum for said selected processing element comprises solving the equation $S_j = \sum_{i=0}^{j} D_i$, where $S_j$ 
represents said running partial deviation sum, $D_i$ represents the local deviation associated with an 
i-th processing element, and $j \le (N-1)$ where N is the number of said plurality of processing 
elements on said loop. 

17. (original) The method of claim 11 wherein said computing a local mean value, said 
computing a local deviation, said determining a running partial deviation sum, computing a 
number of tasks to transfer in a clockwise direction, computing a number of tasks to transfer in 
an anti-clockwise direction, and said reassigning tasks relative to the said number of tasks to 
transfer in a clockwise direction and said number of tasks to transfer in an anti-clockwise 
direction are completed simultaneously for each of said plurality of processing elements within 
said loop. 

18. (original) The method of claim 11 wherein said computing a number of tasks to transfer 
in a clockwise direction for said selected processing element includes evaluating at least one of a 
maximum extrema and a minimum extrema of said running partial deviation sum. 

19. (original) The method of claim 11 wherein said computing a number of tasks to transfer 
in an anti-clockwise direction for said selected processing element includes evaluating at least 
one of a maximum extrema and a minimum extrema of said running partial deviation sum. 

20. (currently amended) A computer readable memory device carrying a set of instructions 
which, when executed, perform a method comprising: 

determining a total number of tasks present within said loop; 

calculating a local mean number of tasks for each of said plurality of processing 
elements; 

calculating a local deviation from said local mean number for each of said plurality of 
processing elements; 

determining a running partial deviation sum for each of said plurality of processing 
elements using said local deviation; 

determining a clockwise transfer parameter and an anti-clockwise transfer parameter for 
each of said plurality of processing elements from said running partial deviation sums; and 

redistributing tasks among said plurality of processing elements in response to said 
clockwise transfer parameter and said anti-clockwise parameter for each of said plurality of 
processing elements. 